Acoustic Segmentation for Audio Browsers

Authors

  • Don Kimber
  • Lynn Wilcox
Abstract

Online digital audio is a rapidly growing resource, which can be accessed in rich new ways not previously possible. For example, it is possible to listen to just those portions of a long discussion which involve a given subset of people, or to instantly skip ahead to the next speaker. Providing this capability to users, however, requires generating the necessary indices, as well as an interface which uses these indices to aid navigation. We describe algorithms which generate indices from automatic acoustic segmentation. These algorithms use hidden Markov models to partition audio into segments corresponding to different speakers or acoustic classes (e.g., music). Unsupervised model initialization using agglomerative clustering is described, and shown to work as well in most cases as supervised initialization. We also describe a user interface which displays the segmentation in the form of a timeline, with tracks for the different acoustic classes. The interface can be used for direct navigation through the audio.
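As a rough illustration of the hidden Markov model segmentation idea mentioned in the abstract, the following sketch runs Viterbi decoding over per-frame class log-likelihoods and collapses the best state path into segments. It is not the authors' implementation: the function name `viterbi_segment`, the fully connected topology with a single `stay_prob` parameter, and the assumption that per-frame log-likelihoods are already available (for example, from per-speaker models) are all illustrative choices made here.

```python
import math


def viterbi_segment(frame_loglik, stay_prob=0.9):
    """Segment a frame sequence with a fully connected HMM (illustrative sketch).

    frame_loglik[t][k] is an assumed, precomputed log p(frame t | class k).
    Requires at least one frame and at least two classes.
    Returns a list of (start_frame, end_frame_exclusive, class_index).
    """
    n_classes = len(frame_loglik[0])
    log_stay = math.log(stay_prob)
    # Hypothetical choice: the remaining probability is split evenly
    # among the other classes.
    log_switch = math.log((1.0 - stay_prob) / (n_classes - 1))

    def trans(j, k):
        return log_stay if j == k else log_switch

    # Uniform initial distribution (a constant, so it is omitted).
    score = list(frame_loglik[0])
    backptrs = []
    for obs in frame_loglik[1:]:
        new_score, ptr = [], []
        for k in range(n_classes):
            best_prev = max(range(n_classes),
                            key=lambda j: score[j] + trans(j, k))
            new_score.append(score[best_prev] + trans(best_prev, k) + obs[k])
            ptr.append(best_prev)
        score = new_score
        backptrs.append(ptr)

    # Backtrace from the best final state.
    state = max(range(n_classes), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()

    # Collapse runs of identical states into segments.
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((start, t, path[start]))
            start = t
    return segments
```

With a high `stay_prob`, an isolated frame that weakly favors another class does not trigger a speaker change, which is the smoothing behavior one would want from an index used for timeline navigation.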


Similar resources

Model Selection Criteria for Acoustic Segmentation

Robust acoustic segmentation has become a critical issue in order to apply speech recognition to audio streams with variable acoustic content, e.g. radio programs. Many techniques in the literature base segmentation on statistical model selection, by applying the Bayesian Information Criterion. This work reviews alternative model selection criteria and presents comparative experiments both unde...


On Building and Evaluating a Broadcast-News Audio Segmentation System

Audio segmentation is useful in diverse applications like audio indexing and retrieval, subtitling, monitoring of acoustic scenes, etc. Also, an initial audio segmentation stage may help to improve the robustness of speech technologies like automatic speech recognition and speaker diarization. In this paper, firstly, the Albayzín-2010 audio segmentation evaluation is reported, including some co...


Content-free Topic Segmentation with Acoustic Features (Report)

In my previous work, content-free topic segmentation is approached by classification methods, and the unit is Vocalization [6]. Speaker ID, vocalization start time, vocalization duration, pause, overlaps and their corresponding Horizon features are emphasized. This followed an approach to segmentation and classification introduced by Luz [2, 3] for analysing recordings of multidisciplinary medi...


Semantic Multi-modal Analysis, Structuring, and Visualization for Candid Personal Interaction Videos

Videos are rich in multimedia content and semantics, which should be used by video browsers to better present the audio-visual information to the viewer. Ubiquitous video players allow for content to be scanned linearly, rarely providing summaries or methods for searching. Through analysis of audio and video tracks, it is possible to extract text transcripts from audio, displayed text from vide...


An Unsupervised Model of Infant Acoustic Speech Segmentation

There is a long standing hypothesis in Developmental Psychology that children use statistical information to segment acoustic speech streams into words. Additionally, several experiments have demonstrated that infants are able to find word breaks using distributional cues. In this paper we propose an algorithm for the unsupervised segmentation of audio speech, based on the Voting Experts (VE) a...




Publication date: 1996